ABSTRACT

A summative evaluation design was developed as a
framework for evaluating instructional materials in remedial reading.
The paradigm includes the selection of (1) relevant variables for
study and (2) the method of study. Two types of reading materials
used in Chicago schools were studied--Cracking the Code (CTC) and the
Mott Semi-Programmed Series in Language Skills (MLS). Random
procedures were used to select the 36 classrooms studied (two
classrooms at each of the fifth-, sixth-, and seventh-grade levels
from schools in each of six school districts representing high,
middle, and low socioeconomic levels). Teachers in these classrooms
were randomly assigned to one of the programs for 1 month and asked
to use the programs as supplements to regular instruction. Pretesting
and post-testing results were compared. Among the conclusions were
(1) that program effects are multiple, (2) that differences based on
socioeconomic levels vary at different grade levels, and (3) that no
simple decisions are possible regarding which of the programs is
superior. Tables of analysis of variance results and references are
included. (MS)

I. General Background

A serious and growing need is developing in American education for
evaluation of educational programs, from the level of the specific
textbook to the level of the general school system. The work of the Center
for the Study of Evaluation of Instructional Programs at UCLA, the American
Educational Research Association's sponsorship of monographs and symposia
on problems in evaluation, and the extensive funding of local evaluation
under the Elementary-Secondary Educational Act (Title III) each point up
the increased professional awareness of the need for sound evaluation
methodology in education. A number of papers have been written concerning
the nature of the problems in evaluation and appropriate research
methodologies. Traditional psychometric achievement testing, with its
emphasis on individual differences, has been challenged as a paradigm
or theory for measurement in program evaluation (Gagne, 1967, 1968; Cronbach,
1963; Tyler, 1967; and Stake, 1967). Scriven (1967) has raised a number of
additional questions about the nature of educational evaluation and, for
example, has challenged the appropriateness of using only the classical
criteria of research designs in which explanation is the primary goal.

This so-called "explanatory" research design is a strategy espoused in
the well known paper of Cronbach (1963). It is in the context of such
intellectual controversy that Westbury's (1970) finding, i.e., that
actual curriculum evaluation research has not been reported in the
educational literature, appears disappointing. It is hoped that the present
research can contribute to the development of our ability to do evaluation.

The research presented here is intended to realize a meaningful
and significant approach to the general problem of educational
evaluation. This study was intended to have value beyond simply knowledge of
the specific curricula studied here, remedial reading programs. Many
basic and pervasive problems are encountered in any educational
evaluation research, and how they are dealt with must have an effect on the
quality of the program evaluation. The design used for this evaluation
was intended to avoid some of those restrictions often encountered in
educational research and thereby allow for more direct and meaningful
applications.

The primary problem framework of this research can be stated rather
simply. What information does a classroom teacher or school
administrator need in order to decide which of several commercially available
curriculum programs should be purchased and used? Currently, the
information available is quite informal, e.g., the recommendation of teachers,
the brochures or orientations given by sales representatives (usually
rather devoid of facts), or the simple common-sense experience which the
teacher has acquired. Furthermore, individual teachers are rarely free
to choose among all possible curriculum programs. Many states have
state-adoption programs which limit schools to use only those materials which
have been officially adopted. In some cases the school or school system
has curriculum staff which serve a screening function for curriculum
materials, or schools may arrive at group or administrative "policy"
decisions about what materials will be acquired by the school itself,
from which the individual teacher can then select. It is important to
note that even when purchase decisions are school-based, the available
information for decisions is still quite informal and generally intuitive.

The ideal situation from the perspective of the administrator or
teacher appears to be one in which each publisher would make available
extensive information on program outcomes as well as extensive cost
information, e.g., materials cost, teacher training cost, usage time
requirements, etc. This is not being done. Publishers contend that they
have neither the financial wherewithal nor the technical skill to
provide all of this information. What are the alternatives?

Although neither total cost nor outcome information is available,
it is clear that the more pressing need is for facts about the outcomes
or results of program usage. This type of information must be available
for rational decision making in education and, joined with cost data,
forms the only intelligent basis for efficient allocation of educational
resources (Alkin, 1969). A suggestion that school districts themselves
perform their own evaluation is not feasible. Wiley and Bock (1967)
point out some of the relatively straightforward problems encountered in
this approach, primarily arising from the limited experimental control
possible in a single district. Some obvious constraints involving the
financial limitation of school districts, parental resistance to
pervasive and continual innovation in the schools, and teacher resistance
add to the list of such difficulties.

A viable strategy for acquiring the necessary information seems to
require the participation of independent investigators. University faculty
or research institutes, supported primarily by numerous school districts
in consort or independently funded, can provide the required technical
competence, objectivity, and capacity to utilize multiple school districts
in exploring program outcomes. This evaluation program was undertaken
to "test," as it were, the viability of such a cooperative,
inter-district model for curriculum evaluation.

Scriven's (1967) conceptual framework provides the terms "summative"
and "pay-off" which can be used to describe the kind of evaluation needed
for commercial programs. What is at the core of the decision problem
from this perspective is the acquisition of knowledge concerning the
outcomes or behavioral results due to the application of a curriculum
program. This is a "black box" perspective in which performance or
output is the focus rather than an attempt to provide a detailed
explanatory specification of instructional process as is apparently proposed
by Cronbach (1963) or Bormuth (1969). However, it is not enough even
to take sides on that issue. What is also needed in order to do
evaluation research is a working framework or paradigm which points at 1)
relevant variables for study and 2) the method of study. The following
considerations were used as such a framework, and provided a basis upon
which the present investigation was designed.

The Working Framework 

1) The Variables:

1. The Educational Program (curriculum) analysis: 

a. What is the content? 

b. How is it used? 

c. Who is to use it? 

d. Who is to receive it?

2. The measurement of outcome:

a. What are the skills or knowledge directly taught in the 
program? 

b. What are the general skills or knowledge built upon the 
direct skills? 

3. What are the properties of schools which may be related to program 
outcome or effects? 

4. Are there properties of students which may be related to program 
outcome or effects? 

5. What range or extent of applicability of information is needed
or desired?

2) The Method:

The most appropriate and direct method available for obtaining
information on the comparative effects of educational programs is
the experimental method. It allows the investigator to actively
manipulate and control different variables of interest. The theory
of experimental design, as developed by R. A. Fisher, is built
specifically on a procedure called randomization. This procedure guarantees
the validity of inferences about the effects or influences of
experimental treatments. It should be clear that this property of
inferences is very badly needed in education evaluation. The experimental
paradigm also, or perhaps primarily, has furnished an extensive basis
for analyzing resultant data and making inferences based on such data.
There has been an excellent critique of the problems in the use of
experimental design in education (Campbell and Stanley, 1963). Wiley
and Bock (1967) show some aspects of at least one way randomization
can be appropriately done, i.e., on the level of the classroom.
Hopefully, demonstration of the application of experimental strategies
to evaluation can facilitate the practice and development of
curriculum evaluation.
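
As an illustration only, randomization at the level of the classroom can
be sketched in a few lines of Python. The function name, the classroom
labels, and the seed are hypothetical and are not taken from the study;
the sketch simply shuffles the classrooms and deals the treatments out in
rotation, so that chance alone decides which classroom receives which
program.

```python
import random

def assign_classrooms(classrooms, treatments, seed=None):
    """Randomly assign treatments to classrooms in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(classrooms)
    rng.shuffle(shuffled)  # chance alone fixes the order of the classrooms
    # Deal the treatments out in rotation over the shuffled classrooms.
    return {room: treatments[i % len(treatments)]
            for i, room in enumerate(shuffled)}

# Hypothetical labels; the seed only makes the illustration repeatable.
rooms = [f"classroom_{i}" for i in range(1, 7)]
assignment = assign_classrooms(rooms, ["MLS", "CTC"], seed=1970)
```

Because the classroom, not the pupil, is the unit assigned here, it is
also the natural unit for the later analysis.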

The primary importance of the questions listed under "Variables"
in the Working Framework is that answers to them can specify the relevant
aspects of an educational program in terms of the parameters which
could affect program results. Given these properties of a program, an
experimental design could be constructed to yield important information
for use by the prospective decision maker. These considerations were
specifically applied to the remedial reading programs evaluated in this
study and will be reviewed below.




Measurement of Program Effects 

There are two properties of any measurement procedure used in
evaluation which are critical. Both of these properties basically involve
"validity" considerations, as opposed to the usual concern with the
statistical "reliableness" of measures. First, there must be an acceptable
correspondence between the instructional content of the program(s) and the
behavior or performance observed in the measurement procedure. The
degree of this correspondence cannot itself be measured absolutely, but
it can be judged qualitatively. Gagne's (1969) term distinctiveness may
well apply here. The second property could be referred to as
completeness. What is of concern here is the scope or breadth of observations of
phenomena which are "indirectly" related to the immediate content of the
program(s). For example, a measurement procedure which included
"thought" questions based on an arithmetic program would be more
complete than one which only included simple computational exercises. A
classical learning paradigm would refer to these more "complete"
observations as measures of transfer or of response generalization.

The area of instruction investigated here is that of reading. The
remedial nature of these curriculum materials imposes a very significant
additional constraint on the content of the programs. The emphasis on the
so-called "decoding" process, i.e., generating phonetic representation of
the written text, is evident in both of the programs studied here. The
commonality and inclusion of such letter-to-sound training is due to the
belief that:

1. This skill is clearly a prerequisite for the real business
of reading, comprehension of meaning;

2. Many children cannot read and comprehend meaning in written
materials because they are not able to decode from letter to
sound; and

3. Therefore, they must be trained in that skill.

The reasonableness of the first and second parts of the above
rationale is not completely known (Desberg and Berdiansky, 1968; Levin
and Gibson, 1968). It could be argued that it is irrelevant to the
evaluation measurement problem. What must be done, nevertheless, is
measurement of these instructional skills because they are taught by the
curriculum and therefore relevant to evaluation.

Two measurement devices were used which are related to the decoding,
or word attack, skills. The Letter-Sound Correspondence Test (LSC) [1]
and the Bond-Clymer-Hoyt Silent Reading Diagnostic Tests (SRD) were
chosen, not only for their distinctiveness, but for the fact that they
are group administered tests, a very necessary attribute. The LSC test
is based on linguistic studies of English orthography, and the basis for the
measurement procedure is given in Venezky et al. (1969). The SRD test
can perhaps be described best by a list of the subtests used:

1. Syllabication,

2. Root Word Location,

3. Word Elements (Sound to Letter),

4. Beginning Sounds (Sound to Letter),

5. Rhyming Sounds (Sound to Letter),

6. Letter Sounds (Sound to Letter).

[1] Under development by R. Venezky, R. Chapman, and R. Calfee of the
Research and Development Center for Cognitive Learning at the University
of Wisconsin.



The second property of evaluation measurement, completeness, was
realized here through the use of the Iowa Silent Reading Tests (ISR).
The ISR test is primarily a test of comprehension skills, although an
analysis of the sources of information used in the test shows the test
to be quite complex (Bormuth, 1968). The test is group administered and
a traditional, widely used test of reading. There are two primary reasons
for including such a test in the evaluation measurement. First, effective
comprehension of written material is the basic goal or target of reading
instruction; it is the final behavioral objective. Therefore, in a
fundamental sense, no instructional program for reading, whether remedial or
not, can be evaluated without some measurement of comprehension behavior.
Second, the assumption common to both instructional programs, i.e., the
key role of decoding/word attack skills in remedial reading instruction,
forces one to go beyond the measurement of only letter-sound knowledge.
The situation which must be avoided is one in which instructional effects
are investigated on letter-sound knowledge, but there is no evidence
collected regarding instructional effects on comprehension skill through
improvement in letter-sound knowledge. The assumption made in the materials
in order to arrive at a remedial program must not remain an assumption in
evaluation, but become an hypothesis subject to empirical examination.
That is, does improvement in letter-sound-correspondence knowledge lead
to increases in comprehension skill?

Instructional Materials

The commercial materials examined in this research are the Mott
Semi-Programmed Series in Language Skills (MLS), published by the Allied
Education Council, and Cracking the Code (CTC), published by Science
Research Associates. A step which must be taken is an analysis of the
content, methods and goals of the two programs.

The Cracking the Code materials are designed to teach children
basic letter-sound patterns using a deductive approach. This method,
often called "linguistic word attack," presents regular grapheme-phoneme
correspondences in several words. It is hoped that, by practicing such
words, the child will either:

1. Discover the letter-sound rules and then use them inductively,
or

2. Figure out new words by analogy, using known patterns.

The core of the program is the workbook, which is divided into twelve
sections, each with a corresponding section in the accompanying reader.
The reader is designed to provide practice using the word patterns that
have been learned from the workbook. These patterns are introduced
according to their "frequency of occurrence in writing" and "ease of discovery."
Infrequent or difficult patterns are introduced near the end of the program.
However, many of the word patterns introduced in the same lesson can be
easily confused. For example, lesson 12 introduces the patterns ight,
lain, eight, ought, and aught.

Since reading is defined as a process of decoding writing into sound,
word recognition skills are taught and vocabulary and comprehension skills
are ignored. But even within this restricted framework, problems outside
of grapheme-phoneme correspondences have been handled superficially,
where they have been handled at all. For example, the word recognition
skills of syllabification and morphological division receive very little
systematic attention.

The book of readings accompanying the workbook presents words
containing the sounds introduced in the workbook. The selections begin
with quite easy words presented in an extremely "linguistic" format ("it
was odd to run into a bug in the lap of a Dupenpox on top of a hill") (p. 10).
However, they quickly progress into more conventional readings. There is
no noticeable change in the difficulty of the vocabulary or syntax of the
stories from page 30 to the end of the book (page 215). However, this is
an impressionistic analysis; readability formulas have not been applied.

The Mott Semi-Programmed Series in Language Skills materials are
more eclectic in both content and approach. They include exercises in
writing as well as in all phases of reading. Comprehension, vocabulary
and word recognition skills are taught. The program includes many
practical applications of reading such as reading labels and newspapers. The
MLS materials may be divided into the first six and the last four books.
The first six books teach letter-sound correspondences. They are roughly
comparable to the CTC series. For the most part, an inductive
approach is used. The last four books present extensive reading,
vocabulary, and more advanced exercises on word recognition skills such
as syllabification and morphology.

The MLS materials combine inductive and deductive methods for
teaching grapheme-phoneme correspondences. Typically, a word is
introduced which contains the pattern to be taught. For example, if the "ase"
pattern is to be taught, the word "case" is used. By changing the first
letters, several words are formed with this pattern ("lace", "race",
"place"). Such an approach requires the ability to blend letter sounds
into words.



The sequence of the first six books is roughly comparable to CTC.
However, much additional material such as stories and word studies is
added in books five and six. The pacing is indeterminate since each
child supposedly proceeds at his own speed, although the pace of MLS seems
to be slower than that of CTC. CTC may present several deductive patterns
simultaneously, but MLS will present only patterns. The MLS program tends
to present sounds in units. For example, the hard and soft sounds of "c"
and "g" are presented in sequence, the three sounds of "oo" are in
sequence, the three sounds of "es" are presented in sequence. "Eu" and
"ew" representing the same sounds are presented in sequence. There is
some review provided; it appears at large but irregular intervals.



Books seven, eight, nine, and ten of MLS present many lessons in
reading. Several, listed under "American Scene," have very practical
applications such as reading labels, newspapers, magazines, etc. There is
also extensive vocabulary study in "word study." In addition, the following
topics, which may be considered word recognition skills, are treated in
detail: book seven--compound words, prefixes and suffixes,
syllabification; and book eight--synonyms, antonyms, and homonyms.




This review of the content and method of the programs provides a
basis for determining an appropriate domain for evaluation. This review
must also be placed in the context of the avowed goals and limitations of
the programs as presented by the publishers.

1. Both programs are intended for use with "non-pathological"
remedial readers. Only CTC is more restrictive, with its
focus on readers with only "decoding" difficulties.

2. Both programs are intended for use with children in the middle
grades, i.e., five through nine.

3. Both programs are intended to be used by the classroom teachers.

4. Both programs lack detailed placement or diagnostic
procedures for use with the programs.

5. Both programs are introduced to teachers primarily through an
accompanying teacher's manual. Orientations given by sales
representatives are 30 to 90 minutes long and focus on explaining
the manual.

6. Both programs are designed to be supplementary, in that they
are not intended as the sole material to be used for reading
or language skill instruction.

Instruction and the Schools 



The basic question which must be answered here is, are there any
properties of schools which can mediate the influence of the instructional
program? Certainly there is a non-trivial problem in specifying which
properties are truly associated with schools as units versus simply aggregate
qualities of pupils in the schools. Correlations between average student
I.Q., say, and average teacher salary or education level, need not imply
a reducibility of one to the other. Important characteristics of
neighborhoods, which give rise to both average student I.Q. level and to teacher
salary level, can be sources of influence common to these conceptually
independent phenomena and therefore lead to non-zero correlation between
them.

Scriven (1967) makes the point that whenever a set of materials or 
an instructional program is used in the classroom, the program itself is 
not only realized, but is incorporated into the entire instructional 
sequence which a teacher implements. Thus, the instructional program for
the students consists of the materials in the hands of the teacher.
Furthermore, the general instructional activity of teachers is a part of
the educational practices of the teacher's school or school system.
Therefore, one would expect instructional practices of teachers to differ
in association with relevant differences among schools. Finally, the
single most pervasive property which can be associated with a school is
its socioeconomic status as a unit. Primarily financial, but also
concomitant educational and occupational, attributes of the neighborhoods in
which schools operate determine and constrain in various ways the
educational practices of the local school.

The fact that the MLS materials were originally developed for use
in a midwest industrial town which is noted for its poverty and illiteracy
leads us to anticipate that this program may have a greater effectiveness
in poorer rather than wealthier schools. Conversely, the CTC materials
are derived from materials which have had a good deal of success in
suburban school systems. It appears to be a reasonable question as to
whether CTC will be as effective as MLS in poorer schools. For both of
these programs, the possibility of differential effectiveness is based on
considerations of the practices and resources of the schools themselves.
In poorer schools, it is not simply that they may have a larger number of
deficient readers, but that they have almost no resources, either
in staff or equipment, to deal with these remedial children. A single
visit to a wealthy school system can demonstrate the extensiveness of
resources available there for special problem students. These
differences among schools make up significant behavioral systems in which new
materials are utilized.

Student Characteristics and Instruction

The socioeconomic properties of schools, it has been mentioned, are
associated with the aggregate properties of students. The major
investigations of socioeconomic status and educational variables have considered
the individual pupil as the unit of study. Jensen (1969) stated, "The
relationship between SES and IQ constitutes one of the most substantial
and least disputed facts in psychology and education." Furthermore,
Whiteman and Deutsch (1968) found substantial correlations between
socioeconomic status and reading performance. Their findings include the well
known substantial correlation between reading performance and IQ, and
therefore, the concomitant joint association of these two variables with
SES. These results are all based on individual pupil characteristics.

Although it is conceptually problematic, it is fortunate on the
practical level that controls for SES properties of schools implicitly control
for SES properties of pupils. The conceptual problem centers around the
determination of which agent, school vs. pupil, is the basic or primary
vehicle for the influence of SES on program effectiveness. However, again
on the practical level, this conceptual problem may in fact not be a
relevant problem. American society is such that pupil and school, via at least
a common neighborhood, have highly similar SES qualities. The significant
implication is that any instructional program, for use by classroom teachers,
will invariably be inserted into a classroom situation in which these SES
properties are jointly in effect. Thus, evaluation of program
effectiveness can yield sufficient information by simply treating
school-plus-student as a functional unit, i.e., ignoring the issue about which is more
"important."



There is an additional variable of students which is relevant here.
The variable of age, or more directly, grade of the student is important
because the materials are intended for use with children who are beyond
grade four. Such a wide domain of use forces one to question the
uniformity of program effectiveness over grade levels. In the first place,
deficient readers in the higher grades (above grade 6) have not only failed
more but may have developed quite different strategies for dealing with
their problem than their younger counterparts. Also, the effects of
repeated failure on attitudes and motivations of older students certainly
cannot be ignored in remedial instruction. Thirdly, the cognitive
structures which students bring to bear in new learning experiences certainly
should be expected to differ by grade level. Gagne (1968) has outlined
alternative ways in which these differences can arise and affect
instructional success, and Cronbach and Snow (1969) have described a phenomenon
which is related to this issue, the Aptitude-by-Treatment Interaction (ATI).

The Population: Range of Application 




It must be quite explicitly realized that the essential goal of
commercial materials evaluation is to investigate the effectiveness of programs
as they are normally to be used. There are two primary attributes of an
evaluation study in this regard. First, the "treatment" or program
administration which is realized in the study must be directly related,
i.e., highly similar, to the programs as they will be administered in the
normal, non-research setting. Second, the results of the research must
be applicable to as wide a range of potential consumers as possible.

These two facets of the inferential goals of evaluation were accom- 
plished in this study by: 

1. Simulating in the study, as thoroughly as possible, the normal 
process of materials introduction and usage as obtains in the 
commercial setting; and 

2. Specifying a population of schools from which a true random sample
could be drawn for participation in the study.

Both of these procedures are described in the procedure section of
this document. The point to be made here is that without both of these
procedures the results or inferences of an evaluation study will be of
limited value because:

1. The nature and conditions of program administration will not
be the same, or highly similar, between research and actual
usage;

2. The kinds of school/pupil milieus or situations in which the
programs have certain effects will not be practically specifiable
and generalizable to potential consumers.

Summarizing the above considerations for the evaluation of the MLS
and CTC remedial reading programs, the following decisions were made
about the research design.

1. Randomization, i.e., true experimentation, would be used for
program assignment to classrooms.

2. The program materials would be studied (used) in actual classroom
situations, accompanied by the same procedures used by the publishers
with normal consumers.

3. The socioeconomic status of schools, and thereby pupils, would be
studied in relation to program effectiveness.

4. The grade level of students using the materials would be studied
in relation to program effectiveness.

5. True random sampling of schools from a specified population
(sampling frame) would be done.

II. Procedure

Design of the Study

The two remedial reading programs evaluated in this study were the
Mott Semi-Programmed Series in Language Skills (MLS) and Cracking
the Code (CTC). Both programs utilize the linguistic approach to reading
instruction and are intended for use in the fourth through sixth grades
and up. The two programs differ in mode of presentation. The MLS employs
a programmed instruction format for word-attack skills and comprehension.
The CTC, on the other hand, relies solely on teacher-guided word-attack
(decoding) exercises and utilizes prose reading solely for practice.

Neither the MLS nor the CTC is claimed to be an innovation in the
teaching of reading. Both programs involve principles (e.g., linguistic
approach and programmed format) present in other currently available
reading programs. However, little research substantiating the effectiveness
of these principles has thus far appeared in the literature.

The design of this study has two distinct parts. The first involves
the selection of the schools and classrooms for participation in the study.
The second involves the assignment of treatments or materials to the
classroom.

The classrooms actually used in this study were obtained by stratified
random sampling. From a list of 250 communities and Chicago neighborhoods
published by the Chicago Association of Commerce and Industry, the major
incorporated areas (and neighborhoods within Chicago) in the Standard
Metropolitan Statistical Area of Chicago were divided into three groups
based on the median family income, average home value, and assessed
property valuation of each area. The three groups were, for our purposes,
defined as the high, middle, and low socioeconomic (SES) levels.
Separately within each of these groups of 83 areas or neighborhoods,
18 areas were randomly selected for contact. The goal was to obtain six
areas at each SES level for inclusion in the study. Fortunately, each
area was served by a single school district, and it was these
corresponding school districts that were contacted for participation.
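
The sampling step described above can be sketched in a few lines. This is
only an illustration under stated assumptions: the report does not say how
the three economic indicators were combined, so a single hypothetical
`ses_index` stands in for them, and the area names and index values are
made up.

```python
import random

def stratified_sample(areas, per_stratum=18, seed=0):
    """Rank areas by an SES index, cut into three equal strata, and draw
    a simple random sample within each stratum.

    `areas` is a list of (name, ses_index) pairs. The single index is a
    hypothetical stand-in for the three indicators named in the text.
    """
    rng = random.Random(seed)
    ranked = sorted(areas, key=lambda a: a[1], reverse=True)
    labels = ("high", "middle", "low")
    size = len(ranked) // len(labels)      # 250 // 3 == 83 areas per stratum
    sample = {}
    for i, label in enumerate(labels):
        stratum = ranked[i * size:(i + 1) * size]   # leftover area is unused
        sample[label] = rng.sample(stratum, per_stratum)
    return sample

# Illustrative data only: 250 made-up areas with made-up index values.
_rng = random.Random(1)
areas = [(f"area_{i:03d}", _rng.random()) for i in range(250)]
selected = stratified_sample(areas)
```

Sampling separately within each stratum guarantees that every SES level
contributes the same number of candidate areas, which a single random draw
from all 250 areas would not.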

The second part of this study design involved randomly assigning
the treatment conditions to classrooms within each district. Most
schools had only two classrooms at each of the middle grade levels,
i.e., the fifth, sixth, and seventh grades. Because we wished to use
classrooms from the same school, and in general had only two classes
at each grade level, the design chosen involved assigning only two of
our three materials conditions (this includes a control) within each
grade within each school district. Since we wanted to study the effects
of both grade and SES on treatment effectiveness, we adopted a plan for
randomly assigning two treatment conditions which balanced the influence
of grade, SES, and treatment over each other.

The design is best represented by Table 1, and is a partially balanced
incomplete block (PBIB) design (Kempthorne, 1952). There are over 2,500
Ss included in this study.
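
The balance property of such a layout can be checked mechanically. The
sketch below is an assumption about how a block of this kind can be built,
not a procedure stated in the report: it takes the three possible pairs of
conditions and gives each of six districts one of the six possible
orderings of those pairs over the three grades. As far as Table 1 can be
read, the resulting pairs agree with those shown for each SES level. The
random allocation of orderings to districts and of conditions to the two
classrooms is omitted.

```python
from collections import Counter
from itertools import combinations, permutations

TREATMENTS = ("Mott", "SRA", "Control")
GRADES = (5, 6, 7)

def build_block(treatments=TREATMENTS, grades=GRADES):
    """Return one SES block: a list of six districts, each a dict mapping
    grade -> the pair of conditions given to its two classrooms."""
    pairs = list(combinations(treatments, 2))      # the 3 possible pairs
    return [dict(zip(grades, order)) for order in permutations(pairs)]

block = build_block()

# Balance checks within one SES block of six districts.
pair_counts = Counter(
    (grade, pair) for district in block for grade, pair in district.items()
)
treatment_counts = Counter(
    (grade, t)
    for district in block
    for grade, pair in district.items()
    for t in pair
)
```

In this construction every pair of conditions occurs in two districts at
each grade, and every condition occurs in four classrooms at each grade,
so grade and treatment comparisons are not confounded with district.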
Testing

The classrooms chosen for study were administered our reading test
battery in the classroom as a group. The tests were all group tests
designed for administration by non-specialists in either the field of
reading or psychological testing. The battery generally required three
hours of classroom time for administration, with a break given to the
students about halfway through the battery.

All of the pretests were administered by staff members of the Industrial
Relations Center. The posttesting was done primarily by Industrial
Relations Center staff, but approximately one-fourth of the classrooms
were tested by the classroom teacher. Care was taken to spread the
teacher-tested classrooms over SES levels and treatments.

Three measurement instruments were used in this study: 

1. The Iowa Silent Reading Tests--Form CM

2. The Silent Reading Diagnostic Tests--Recognition Techniques

3. The Letter-Sound Correspondence Tests--Version II

Materials Presentation 

The teachers who were randomly assigned to use either of the remedial
materials were given an orientation to their respective materials
during the period of pretesting. The outline followed by the orienters
in the general portion of the introduction to the research, which all
teachers were given, is included (see Appendix A).
TABLE 1

The Randomized Incomplete Block Experimental Design

                  Grade 5             Grade 6             Grade 7
              Class-    Class-    Class-    Class-    Class-    Class-
              room 1    room 2    room 1    room 2    room 1    room 2

SES I
District  1   Mott      SRA       Mott      Control   SRA       Control
District  2   Mott      SRA       SRA       Control   Mott      Control
District  3   Mott      Control   Mott      SRA       SRA       Control
District  4   Mott      Control   SRA       Control   Mott      SRA
District  5   SRA       Control   Mott      SRA       Mott      Control
District  6   SRA       Control   Mott      Control   Mott      SRA

SES II
District  7   Mott      SRA       Mott      Control   SRA       Control
District  8   Mott      SRA       SRA       Control   Mott      Control
District  9   Mott      Control   Mott      SRA       SRA       Control
District 10   Mott      Control   SRA       Control   Mott      SRA
District 11   SRA       Control   Mott      SRA       Mott      Control
District 12   SRA       Control   Mott      Control   Mott      SRA

SES III
District 13   Mott      SRA       Mott      Control   SRA       Control
District 14   Mott      SRA       SRA       Control   Mott      Control
District 15   Mott      Control   Mott      SRA       SRA       Control
District 16   Mott      Control   SRA       Control   Mott      SRA
District 17   SRA       Control   Mott      SRA       Mott      Control
District 18   SRA       Control   Mott      Control   Mott      SRA

The main emphasis in the materials orientation given to the teachers
was a description of the materials, how a teacher was to use the materials
and, most important, a review of the teachers' manual and how it was to be
used. It was not one of our goals or practices to present to the teachers
a theory or new concept of teaching reading to poor readers. Our main goal
was to get the teachers into the manuals and help them with questions. It
was expected, or hoped, that the manuals would carry the primary burden
of teacher instructions. We stressed to the teachers that they were to
contact us if they wanted assistance and also that we would follow up with
them in January. Finally, since no placement or diagnostic procedures
accompanied the materials, the teachers were instructed to use the materials
with any student they decided, by whatever means, might benefit from the
instruction.

III. Results

The assignment of the reading materials to students in this study
was based on the classroom as an administrative teaching unit.
The grouping of students into classes for instruction will usually be
reflected in similar performance among children in the same class. This
similarity of performance of students, as grouped by classrooms, must be
directly accounted for in the analysis of the effects of the programs
being studied. Thus, instead of there being 2,500 observations for
analysis of program effects in this study, i.e., the number of pupils
measured, there are only 124 observations, i.e., the number of different
classrooms actually measured.
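
The reduction from pupils to classrooms as the unit of analysis amounts
to replacing each classroom's pupil scores by their mean. A minimal
sketch, using made-up scores and a hypothetical record layout of
(classroom, score) pairs:

```python
from collections import defaultdict
from statistics import mean

def classroom_means(records):
    """Collapse pupil records (classroom_id, score) to one mean per classroom."""
    by_class = defaultdict(list)
    for classroom_id, score in records:
        by_class[classroom_id].append(score)
    return {cid: mean(scores) for cid, scores in by_class.items()}

# Made-up scores for illustration only.
pupils = [("A", 10.0), ("A", 14.0), ("B", 20.0), ("B", 22.0), ("B", 24.0)]
units = classroom_means(pupils)   # two analysis units from five pupils
```

Five pupil scores yield two observations here, in the same way that
roughly 2,500 pupils yield 124 classroom observations in the study.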

Wiley and Bock (1967) provide relevant data as well as a rationale
for treating the classroom as a unit of analysis, and the reader is
referred to that paper for a more thorough elaboration of the strategy.
The primary goal of the analysis reported here is to assess the
performance effects of the two reading programs. There are four general
aspects of the results presented here:

1. Description of the measures;

2. Distribution of program usage;

3. Analysis of program effects by independent variables,
e.g. main effects and interactions;

4. Analysis of program effects by dependent variables, e.g.
over skills.

1. Description of the Measures



The standard deviations and reliabilities for the following six
scores are based on the pooled within-classroom variability, i.e., the
student's score minus the average for his classroom. These are presented
in Table 2.
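
A pooled within-classroom standard deviation of this kind can be computed
as sketched below. The function name and the toy scores are illustrative
assumptions; the report does not give its computing formula, and the usual
pooled estimate (summed squared deviations from classroom means, divided
by the summed degrees of freedom) is assumed.

```python
from math import sqrt

def pooled_within_sd(classes):
    """Pooled within-classroom SD: deviations are taken from each
    classroom's own mean and the sums of squares are pooled."""
    ss = 0.0
    df = 0
    for scores in classes:
        m = sum(scores) / len(scores)
        ss += sum((x - m) ** 2 for x in scores)
        df += len(scores) - 1
    return sqrt(ss / df)

# Two made-up classrooms with very different means but equal spread.
classes = [[10.0, 12.0, 14.0], [30.0, 32.0, 34.0]]
sd_within = pooled_within_sd(classes)
```

Because each score is referred to its own classroom mean, differences
between classroom averages do not inflate the estimate.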

2. Distribution of Program Usage

In the procedure section, it was pointed out that the classroom
teachers were assigned one of the two reading programs. They were free to
determine the extent of use of the materials in their own classrooms.
This teacher option resulted in the frequencies of actual student
participation in the programs which are presented in Table 3.

These frequencies show two phenomena. First, there is greater
usage of materials in the lower economic group than in the higher, a not
too surprising result. Second, there is greater use of the Mott (MLS)
materials than the SRA materials within similar classroom categories.
This may have arisen from a difference in the degree of teacher
participation needed to use the materials, the MLS being semi-programmed.

3. Program Effects--Overall Multivariate Comparisons

The preceding data on the differential usage of the program materials
do not in themselves complicate analysis of performance differences.
However, the fact that for both groups of program classrooms there were
some students within individual classrooms who did not use the materials,
while some students in the same classroom did use them, is somewhat
problematic. The use of the classroom as a unit of analysis for comparing
program effects usually rests on the fact that all of the students in the
classroom are treated similarly with respect to the instructional
TABLE 2

STANDARD DEVIATIONS AND RELIABILITY COEFFICIENTS
OF THE RESPONSE MEASURES

                                      Pretest          Posttest     Number of
Measure                            S.D.    Rel.     S.D.    Rel.      Items

Letter-to-Sound Test                8.55   .872      7.78   .858        50
Syllabication & Root Word Tests     6.52   .842      5.88   .331        54
Sound-to-Letter Tests              11.50   .86?     10.53   .884       120
Paragraph Comprehension Tests      10.01   .873     11.37   .900        90
Vocabulary Tests                    6.49   .759      7.11   .805        54
Sentence Meaning Test               3.51   .532      3.96   .710        27

TABLE 3

NUMBERS OF PUPILS WHO RECEIVED THE MATERIALS
FOR EACH GRADE IN EACH SOCIOECONOMIC LEVEL

SES    GRADE

 1     5
       6
       7
       Sum

 2     5
       6
       7
       Sum

 3     5
       6
       7
       Sum

programs being studied. This is not the case here. An analysis which
did, nevertheless, average the scores of all students within a classroom,
and thereby ignored actual usage patterns, could obscure the detection
of actual program effects.

The analytic procedure which was chosen here attempted to incorporate
both the classroom as the basic unit and the fact that there are
within-classroom treatment differences. The technique used to do this
involved doubling the number of measurements for each classroom. The two
sets of measures associated with each classroom consisted of the pre- and
posttest averages for, first, the group of students who did not receive
the instructional materials and, second, the group of students who did
receive the materials. This results in a 24-element response vector for
each classroom, made up of the six reading scores for both pre- and
posttests, each for both treatment subgroups within the classroom. This
allows for the fact that the performances of the two groups of students
are correlated as a result of their being in the same classroom. Tests of
significance in an analysis of variance will thereby not be invalidated,
because the error covariance matrix can reflect the intraclass
correlation among the subgroup scores.
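
The layout of the 24-element response vector can be made concrete with a
short sketch. The identifiers and the ordering of the elements are
assumptions made for illustration (the measure names follow Table 2); the
report does not specify the order used in the analysis.

```python
from itertools import product

# Hypothetical identifiers for the six reading scores listed in Table 2.
MEASURES = ("letter_to_sound", "syllabication_root", "sound_to_letter",
            "paragraph_comprehension", "vocabulary", "sentence_meaning")
OCCASIONS = ("pre", "post")
SUBGROUPS = ("no_materials", "materials")

def response_vector(means):
    """Flatten {(subgroup, occasion, measure): mean} into a fixed-order list."""
    return [means[key] for key in product(SUBGROUPS, OCCASIONS, MEASURES)]

# Placeholder means (all zero) just to show the shape of the vector.
example = {key: 0.0 for key in product(SUBGROUPS, OCCASIONS, MEASURES)}
vector = response_vector(example)   # 2 subgroups x 2 occasions x 6 scores
```

Keeping both subgroups in one vector per classroom is what lets the error
covariance matrix carry the correlation between them.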

There are two facets of the program effects which can be readily
examined using this arrangement of the data. The first involves comparing
the performance of only the students who received materials across
classroom factors, e.g. grade or SES by program type interactions. The
second set of comparisons involves examining the within-classroom
differences between the treatment subgroups and studying the relative
differences over classroom factors.

The multivariate F-ratios for the ANOVA corresponding to various
treatment effects are presented in Tables 4 and 5. The incomplete and
partially balanced nature of the research design renders the effects
correlated. The order in which the F-ratios are computed is therefore
important, since it affects the size of the mean squares. Also, there
are different error terms for various sources of variance. The following
structure was used for the ANOVA here. All terms are based on the
elimination of preceding sources of variance.



Source                                     df

Grand Mean                                  1
(A) SES                                     2
    School (error for A)                   18
(B) Grade                                   2
    School x Grade (Lin) (error for B)     18
(C) Treatment                               2
    School x Treatment (error for C)       18
(D) SES x Treatment (MLS-SRA)               2
    Grade x Treatment (MLS-SRA)             2
    SES x Grade x Treatment (MLS-SRA)       4
    Residual (error for D)                 43
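
The reason the order of elimination matters when effects are correlated
can be seen in a small numerical illustration. The two contrasts and the
response values below are made up and have no connection with the study
data; the sketch simply compares the sum of squares credited to one
contrast when it is fitted first with the amount credited to it after a
correlated contrast has been eliminated.

```python
def centered(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ss_regression(x, y):
    """Sum of squares in y explained by a single centered predictor x."""
    return dot(x, y) ** 2 / dot(x, x)

def residualize(x, z):
    """Remove from x the part linearly predictable from z (both centered)."""
    b = dot(x, z) / dot(z, z)
    return [xi - b * zi for xi, zi in zip(x, z)]

a = centered([0, 0, 0, 1, 1, 1])   # hypothetical contrast for factor A
b = centered([0, 0, 1, 0, 1, 1])   # hypothetical contrast for factor B
y = centered([1.0, 2.0, 4.0, 3.0, 6.0, 7.0])

ss_a_first = ss_regression(a, y)                    # A fitted before B
ss_a_after_b = ss_regression(residualize(a, b), y)  # A after eliminating B
```

With these values the sum of squares for A is 13.5 when A is fitted first
and 16/3 after B has been eliminated; in a fully balanced design the two
would coincide.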



Table 4 shows the F-ratios for the vector contrasts of posttest measures,
corrected for pretests, for the MLS versus SRA program comparisons.
Single degree of freedom comparisons are presented rather than pooled tests.
The last F-ratio, SES (Quad) x Grade (Quad) x Program, is found to be
significant. This would imply that the comparative effects of the
programs depend on the grade and SES levels of the classrooms in which
the programs are used. This will be examined in more detail below.

Table 5 shows the F-ratios for the vector contrasts based on the
within-classroom subgroup differences in posttest performance, adjusted
for the pretest differences. This table indicates that only the overall
program differences are of importance to performance.